[ET-VK] Add cooperative matrix dispatch for quantized linear by xuyanwen2012 · Pull Request #19892 · pytorch/executorch

xuyanwen2012 · 2026-05-30T00:18:48Z

Summary

Adds KHR cooperative-matrix dispatch for quantized linear on the Vulkan backend, extending the fp16 coopmat path from #19009 to quantized weights:

4-bit weight — linear_q4gsw_coopmat (fp16 act × INT4 weight) and linear_dq8ca_q4gsw_coopmat (8-bit dynamic act × INT4 weight)
8-bit weight — linear_dq8ca_q8csw_coopmat (8-bit dynamic act × INT8 weight), plus its tiled V_DOT4 fallback and op registration

Coopmat is gated on Adapter::supports_cooperative_matrix(), a wave64 subgroup, buffer output storage, half dtype, and M/N/K tile alignment. Ineligible shapes — including any with a bias — fall back to the existing tiled shaders.
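As a rough illustration of the gate described above, the eligibility conditions can be written as one predicate. This is only a sketch: the real check is in QuantizedLinear.cpp (can_use_q4gsw_coopmat) and is written in C++. The names `LinearDispatchInfo` and `can_use_coopmat` are hypothetical, and the tile sizes are placeholder assumptions, since the PR description does not state the actual alignment values.

```python
from dataclasses import dataclass

# Placeholder tile sizes (assumption): the PR text only says M/N/K must be
# tile-aligned and does not give the actual values.
TILE_M, TILE_N, TILE_K = 16, 16, 16


@dataclass(frozen=True)
class LinearDispatchInfo:
    """Hypothetical bundle of the facts the gate inspects."""
    supports_cooperative_matrix: bool  # stands in for Adapter::supports_cooperative_matrix()
    subgroup_size: int                 # 64 for a wave64 device
    output_is_buffer: bool             # buffer (not texture) output storage
    dtype: str                         # "half" for fp16
    has_bias: bool
    m: int
    n: int
    k: int


def can_use_coopmat(info: LinearDispatchInfo) -> bool:
    """Return True only when every condition listed in the summary holds."""
    aligned = (
        info.m % TILE_M == 0
        and info.n % TILE_N == 0
        and info.k % TILE_K == 0
    )
    return (
        info.supports_cooperative_matrix
        and info.subgroup_size == 64
        and info.output_is_buffer
        and info.dtype == "half"
        and not info.has_bias  # shapes with a bias fall back to the tiled shaders
        and aligned
    )


eligible = LinearDispatchInfo(True, 64, True, "half", False, 64, 128, 256)
with_bias = LinearDispatchInfo(True, 64, True, "half", True, 64, 128, 256)
print(can_use_coopmat(eligible), can_use_coopmat(with_bias))
```

Any single failing condition sends the dispatch to the existing tiled shader, so the fallback path stays the default.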

Review order

QuantizedLinear.cpp (dispatch gate can_use_q4gsw_coopmat) → the linear_*_coopmat.glsl shaders → op_registry.py / custom_ops_lib.py / patterns/quantized_linear.py (registration) → the custom-op tests.

Test plan

Built against main with EXECUTORCH_BUILD_VULKAN=ON; ran the custom_ops prototyping tests on an AMD Radeon 780M (RDNA3, wave64):

test_q4gsw_linear: 72/72 correctness pass
test_dq8ca_q8csw_linear: 22/22 correctness pass

Per the existing convention, fp16 (coopmat-only) correctness is not asserted against the fp32 CPU reference (the fp16 round-trip diverges at near-zero / overflowing elements); the coopmat path is exercised via build + dispatch + perf.
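The divergence mentioned above can be reproduced without a GPU. The sketch below uses only the Python standard library to round-trip values through IEEE half precision. The helper name `fp16_round_trip` is hypothetical, and the sample values are illustrative rather than taken from the PR's tests.

```python
import math
import struct


def fp16_round_trip(x: float) -> float:
    """Convert x to IEEE half precision and back to a Python float.

    struct's "e" format raises OverflowError for finite values too large
    for fp16; that case is mapped to a signed infinity here, which is the
    saturating behavior assumed for this illustration.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


# A well-scaled value, a near-zero value, and a value above the fp16
# maximum of 65504.
for value in (1.0, 1e-8, 70000.0):
    print(value, "->", fp16_round_trip(value))
```

A near-zero element that flushes to zero has unbounded relative error against an fp32 reference, and an overflowing element becomes infinite, so a tolerance check tuned for fp32 would flag both even when the shader is correct.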

Open questions (draft)

Reachability: the coopmat path requires buffer output storage, so it only fires when the partitioner selects buffer storage for the linear's output. Feedback is welcome on the preferred way to make it reachable in a typical export.

Adds coopmat shaders and dispatch for 4-bit (q4gsw, dq8ca_q4gsw) and 8-bit (dq8ca_q8csw) quantized linear, gated on Adapter::supports_cooperative_matrix(), wave64 subgroup size, buffer output storage, and coopmat tile alignment — mirroring the fp16 coopmat path from pytorch#19009. Ineligible shapes fall back to the existing tiled shaders. Review order: QuantizedLinear.cpp for the dispatch gate (can_use_q4gsw_coopmat), then the linear_*_coopmat.glsl shaders, op_registry.py / custom_ops_lib.py / patterns for registration, then the tests.

pytorch-bot · 2026-05-30T00:18:52Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19892

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla · 2026-05-30T00:18:55Z

❌ - login: @xuyanwen2012 / name: Yanwen Xu. The commit (05a4f50) is not authorized under a signed CLA. Please click here to be authorized. For further assistance with EasyCLA, please visit our EasyCLA portal and chat with our support bot.

github-actions · 2026-05-30T00:19:32Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

meta-cla bot added the "CLA Signed" label · May 30, 2026

xuyanwen2012 wants to merge 1 commit into pytorch:main from sarc-acl:yanwen/quant-dev
